{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# hvPlot.hist\n",
    "\n",
    "```{eval-rst}\n",
    ".. currentmodule:: hvplot\n",
    "\n",
    ".. automethod:: hvPlot.hist\n",
    "```\n",
    "\n",
    "## Backend-specific styling options\n",
    "\n",
    "```{eval-rst}\n",
    ".. backend-styling-options:: hist\n",
    "```\n",
    "\n",
    "## Examples\n",
    "\n",
    "Histograms are used to approximate the distribution of continuous data by dividing the range into bins and counting the number of observations in each bin."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Basic histogram plot"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import hvplot.pandas  # noqa\n",
    "import numpy as np\n",
    "import pandas as pd\n",
    "\n",
    "df = pd.DataFrame({'values': np.random.normal(loc=0, scale=1, size=1000)})\n",
    "\n",
    "df.hvplot.hist()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Basic histogram with bins\n",
    "\n",
    "You can control the number of bins by setting it with an integer value."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import hvplot.pandas  # noqa\n",
    "import numpy as np\n",
    "import pandas as pd\n",
    "\n",
    "df = pd.DataFrame({'values': np.random.normal(loc=0, scale=1, size=1000)})\n",
    "\n",
    "df.hvplot.hist(\n",
    "    y='values', width=300,\n",
    "    bins=30, title='Histogram (bins=30)'\n",
    ") +\\\n",
    "df.hvplot.hist(\n",
    "    y='values', width=300, shared_axes=False,\n",
    "    bins=50, title='Histogram (bins=50)'\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Or with a string value referencing one of the values accepted by the `bins` keyword of [`np.histogram_bin_edges`](https://numpy.org/doc/stable/reference/generated/numpy.histogram_bin_edges.html)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import hvplot.pandas  # noqa\n",
    "import numpy as np\n",
    "import pandas as pd\n",
    "\n",
    "df = pd.DataFrame({'values': np.random.normal(loc=0, scale=1, size=1000)})\n",
    "\n",
    "df.hvplot.hist(\n",
    "    y='values', width=300,\n",
    "    bins='auto', title='Histogram (bins=\"auto\")'\n",
    ") +\\\n",
    "df.hvplot.hist(\n",
    "    y='values', width=300, shared_axes=False,\n",
    "    bins='scott', title='Histogram (bins=\"scott\")'\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Or with a list or 1D Numpy array of edges."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import hvplot.pandas  # noqa\n",
    "import numpy as np\n",
    "import pandas as pd\n",
    "\n",
    "df = pd.DataFrame({'values': np.random.normal(loc=0, scale=1, size=1000)})\n",
    "\n",
    "df.hvplot.hist(\n",
    "    y='values', width=300,\n",
    "    bins=[-3+0.5*i for i in range(12)], title='Histogram (bins as a list)'\n",
    ") +\\\n",
    "df.hvplot.hist(\n",
    "    y='values', width=300, shared_axes=False,\n",
    "    bins=np.arange(-3, 3, 0.25), title='Histogram (bins as a numpy array)'\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Histogram with `bin_range`\n",
    "\n",
    "This limits the histogram to a specific range of values."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import hvplot.pandas  # noqa\n",
    "import pandas as pd\n",
    "import numpy as np\n",
    "\n",
    "df = pd.DataFrame({'values': np.random.normal(loc=0, scale=1, size=1000)})\n",
    "\n",
    "df.hvplot.hist(y='values', bin_range=(-2, 2), title='Histogram in range -2 to 2')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Normalized histogram\n",
    "\n",
    "You can normalize the histogram with `normed=True` or `normed='integral'` to show density instead of raw counts. If `normed='height'`, then the frequencies are normalized such that the max bin height is unity."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import hvplot.pandas  # noqa\n",
    "import pandas as pd\n",
    "import numpy as np\n",
    "\n",
    "df = pd.DataFrame({'values': np.random.normal(loc=0, scale=1, size=1000)})\n",
    "\n",
    "df.hvplot.hist(\n",
    "    y='values', width=300,\n",
    "    normed=True, title='Normalize hist (normed=True)'\n",
    ") +\\\n",
    "df.hvplot.hist(\n",
    "    y='values', width=300, shared_axes=False,\n",
    "    normed='height', title='Normalized hist (normed=\"height\")'\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    ":::{note}\n",
    "`normed=True` is equivalent to `density=True` in `np.histogram`.\n",
    ":::"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Cumulative histogram\n",
    "\n",
    "An histogram generated with `cumulative=True` shows a running total of counts up to each bin."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import hvplot.pandas  # noqa\n",
    "import pandas as pd\n",
    "import numpy as np\n",
    "\n",
    "df = pd.DataFrame({'values': np.random.normal(loc=0, scale=1, size=1000)})\n",
    "\n",
    "df.hvplot.hist(y='values', cumulative=True, title='Cumulative Histogram')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Overlay and layout of histogram plots\n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "When setting `y` to a list of variables, the object returned is an overlay of the distribution of each variable ([HoloViews NdOverlay](https://holoviews.org/reference/containers/bokeh/NdOverlay.html) object)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import hvplot.pandas  # noqa\n",
    "\n",
    "df = hvplot.sampledata.penguins('pandas')\n",
    "\n",
    "df.hvplot.hist(y=['bill_depth_mm', 'bill_length_mm'])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Setting `subplots` to `True`, the object returned is a layout ([HoloViews NdOverlay](https://holoviews.org/reference/containers/bokeh/NdLayout.html) object)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import hvplot.pandas  # noqa\n",
    "\n",
    "df = hvplot.sampledata.penguins('pandas')\n",
    "\n",
    "df.hvplot.hist(y=['bill_depth_mm', 'bill_length_mm'], subplots=True, width=300)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "`by` can also be used to generate an overlay or distribution of histograms, by setting it with categorical variable(s)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import hvplot.pandas  # noqa\n",
    "\n",
    "df = hvplot.sampledata.penguins('pandas')\n",
    "\n",
    "df.hvplot.hist(y='body_mass_g', by='sex', alpha=0.5)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import hvplot.pandas  # noqa\n",
    "\n",
    "df = hvplot.sampledata.penguins('pandas')\n",
    "\n",
    "df.hvplot.hist(y='body_mass_g', by=['species', 'sex'], subplots=True, width=300).cols(2)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Xarray example"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import hvplot.xarray  # noqa\n",
    "\n",
    "ds = hvplot.sampledata.air_temperature(\"xarray\").sel(lon=285.)\n",
    "\n",
    "ds.hvplot.hist()"
   ]
  }
 ],
 "metadata": {
  "language_info": {
   "name": "python",
   "pygments_lexer": "ipython3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}
